Crystal structure of Prp16 in complex with ADP

Prp16 is a DEAH-box ATPase required for the splicing of pre-mRNA. The X-ray crystal structure of the Prp16-ADP complex was determined at a resolution of 1.9 Å.


Introduction
The spliceosome is a large molecular machine that is responsible for the removal of noncoding introns from eukaryotic precursor messenger RNAs (pre-mRNAs). For each intron removal, the spliceosome is assembled de novo and performs two consecutive transesterification reactions. In the course of each cycle, the components assemble in a stepwise manner and a number of conformational changes take place (Will & Lührmann, 2011). The spliceosome consists of five small nuclear ribonucleoprotein complexes (snRNPs) and several other non-snRNP proteins (Wahl et al., 2009). One cycle can be divided into assembly, activation, catalysis and disassembly stages. The first event in each cycle is the binding of the U1 snRNP to the 5′-splice site of the pre-mRNA, forming the E complex. Binding of the U2 snRNP to the branch-point sequence and recruitment of the pre-assembled tri-snRNP U4/U6/U5 conclude the assembly step, forming the pre-B complex. For the activation of the spliceosome, the U1 and the U4 snRNPs are displaced while the NineTeen complex (NTC) and the NTC-related complex (NTR) bind, forming the catalytically competent B* spliceosome. Following the first transesterification reaction, the resulting C complex is further remodeled into the C* complex, which can facilitate the second transesterification reaction. Finally, the spliceosome is disassembled, releasing the remaining snRNPs, NTC and NTR, the severed intron lariat and the spliced mRNA. The structural rearrangements of the spliceosome include RNA-RNA, protein-RNA and protein-protein remodeling, which need to be tightly orchestrated and to undergo rigorous quality-control steps, as misspliced mRNA can be linked to diseases such as Duchenne muscular dystrophy (Takeshima et al., 2010) and spinal muscular atrophy (Lorson et al., 1999). All of these remodeling steps are driven and controlled by the action of RNA helicases belonging to helicase superfamily 2 (SF2; Cordin & Beggs, 2013).
The SF2 helicases involved in splicing can be further divided into three subfamilies: DEAD-box, DEAH-box and Ski2-like helicases (Jankowsky & Fairman, 2007).
The protein studied in this work is pre-mRNA-splicing factor 16 (Prp16), which is required for the transition of the spliceosome from the C to the C* state (Wilkinson et al., 2021). To this end, Prp16 binds the 3′-end sequence of the intron RNA and translocates towards the spliceosome, leading to dissociation of the branching factors Yju2 and Cwc25 and thereby enabling exon ligation (Tseng et al., 2011). The translocation also plays a role in splicing-fidelity control through a kinetic proofreading mechanism in which Prp16 antagonizes suboptimal substrates and promotes optimal substrates for 5′-splice site cleavage (Burgess & Guthrie, 1993; Koodathingal et al., 2010; Koodathingal & Staley, 2013; Semlow et al., 2016). Prp16 belongs to the DEAH-box proteins, which share a highly conserved helicase core formed by two RecA-like domains. These two domains provide the necessary architecture that enables DEAH-box helicases to be functional NTPases (Schwer & Guthrie, 1992). The nucleotide binds at the domain interface, where the eight conserved sequence motifs I, Ia, Ib, II, III, IV, V and VI are located. Motifs Ia, Ib and IV are involved in nucleic acid binding of the helicase, and motif II contains the eponymous sequence DEAH. While motifs I, II, V and VI are necessary to bind and hydrolyze the nucleoside triphosphate, motif III is required to couple NTP hydrolysis to the process of unwinding (Campodonico & Schwer, 2002; Schneider et al., 2004). All DEAH-box helicases have a common C-terminal domain architecture comprising winged-helix (WH), helix-bundle (HB) and oligosaccharide-binding fold (OB-fold) domains. These domains have a regulatory effect on the hydrolysis rate of the helicase (Kudlinzki et al., 2012). It has also been shown that this part can serve as a platform for the binding of interaction partners (Cordin & Beggs, 2013).
The N-terminal regions of the spliceosomal DEAH-box helicases greatly vary in their length and are the least conserved region (Cordin & Beggs, 2013).
Recent cryo-EM structures (three-dimensional reconstructions) of spliceosomal complexes in the C or C* state from human or yeast contain map areas which can be related to parts of or even complete Prp16 molecules (Wilkinson et al., 2021; Galej et al., 2016; Yan et al., 2017; Bertram et al., 2020; Zhan et al., 2018). All atomic models of the Prp16 structures were modeled based on the structurally related yeast Prp43 in complex with ADP (PDB entries 3kx2 and 2xau; He et al., 2010; Walbott et al., 2010). As the local resolution of the cryo-EM maps at the position of Prp16 ranges between 8 and 15 Å, the derived atomic models of Prp16 provide only limited information about the structure of this helicase.
Here, we report the first crystal structure of Prp16 from the thermophilic eukaryotic ascomycete Chaetomium thermophilum in its ADP-bound state at 1.9 Å resolution. Analysis of the structure shows the same domain architecture as observed for other spliceosomal DEAH-box helicases. Moreover, the interaction pattern of Prp16 with ADP also seems to be conserved within this helicase family. In addition, a new position of the β-hairpin could be observed, underlining its proposed flexibility.

Macromolecule production
The Prp16-encoding gene of C. thermophilum (ctPrp16) was identified by the NCBI BLAST search tool (Sequence ID EGS23320.1). The codon-optimized sequence encoding ctPrp16 residues 302-920 was cloned into the pGEX-6P-3 vector utilizing the BamHI and EcoRI restriction sites. The recombinant protein was expressed using Escherichia coli Rosetta2 (DE3) cells in 2×YT medium. The cells were induced with 0.5 mM isopropyl β-D-1-thiogalactopyranoside at an optical density (OD600) of 0.6 and were incubated at 16 °C for 20 h. The cells were harvested and washed with phosphate-buffered saline. Prior to cell disruption via a microfluidizer, the cells were mixed at 4 ml g⁻¹ with a buffer consisting of 50 mM Tris-HCl pH 7.5, 500 mM NaCl, 10 mM EDTA. Subsequently, the lysate was centrifuged at 20 000g and 4 °C for 30 min. The supernatant was applied onto a Glutathione Sepharose 4B column (GE Healthcare). A washing step with additional 2 M LiCl was performed to remove bound nucleotides and nucleic acids. Finally, the protein was eluted using 30 mM reduced glutathione. Cleavage of the GST tag was performed by incubation with PreScission protease. The protein was concentrated to 20 mg ml⁻¹ using an Amicon Ultra centrifugal concentrator (Merck). A total of 8 mg ctPrp16 (302-920) could be obtained from 2 l medium. The protein solution was aliquoted into PCR tubes in 105 µl samples, flash-frozen in liquid nitrogen and stored at −80 °C for further usage. Macromolecule-production information is summarized in Table 1.

Crystallization
Crystallization was performed by the sitting-drop vapor-diffusion method at 293 K. ctPrp16 (302-920) was diluted to 4 mg ml⁻¹ (55.98 µM) with gel-filtration buffer. To obtain ctPrp16-ADP complex crystals, the protein was mixed with a tenfold molar excess of ADP (559.8 µM) and reservoir solution consisting of 20%(v/v) PEG 4000, 100 mM MES pH 6.5, 5 mM MgCl2. The total volume of the drop was 2 µl with a 1:1 ratio of protein and reservoir solution. Crystals were observed after one week. Crystallization information is summarized in Table 2.
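The protein and ADP concentrations quoted above are linked by simple arithmetic, which the following minimal sketch reproduces. The molar values are micromolar. The molar mass used here is an assumption: it is not stated in the text and was back-calculated from the quoted pair of values (4 mg ml⁻¹ and 55.98 µM), and the function and variable names are illustrative only.

```python
def molar_conc_uM(mass_conc_mg_per_ml: float, molar_mass_g_per_mol: float) -> float:
    """Convert a mass concentration (mg/ml, numerically equal to g/l) to µmol/l."""
    return mass_conc_mg_per_ml / molar_mass_g_per_mol * 1e6


# Assumption: molar mass of ctPrp16 (302-920), back-calculated from the
# quoted concentrations rather than taken from the sequence.
ASSUMED_MOLAR_MASS = 71_454.0  # g/mol

protein_uM = molar_conc_uM(4.0, ASSUMED_MOLAR_MASS)
adp_uM = 10 * protein_uM  # tenfold molar excess of ADP

print(f"protein: {protein_uM:.2f} µM, ADP: {adp_uM:.1f} µM")
```

A protein of roughly 620 residues is expected to have a molar mass of about 70 kDa, so the back-calculated value is plausible.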

Data collection and processing
For cryoprotection, the ctPrp16-ADP complex crystals were transferred to mother liquor containing an additional 10%(w/v) PEG 4000 and 559.8 µM ADP before being plunged into liquid nitrogen. Oscillation diffraction images were collected on beamline P13 at DESY, Hamburg, Germany. Data were processed using XDS (Kabsch, 2010). Data-collection and processing statistics are summarized in Table 3.

Structure solution and refinement
The structure of ctPrp16 (302-920) was solved by molecular replacement using Phaser (McCoy et al., 2007). The structure of ctPrp2 in complex with ADP (PDB entry 6fac; Schmitt et al., 2018) was used as the search model. Initial rounds of refinement were performed using Phenix (Liebschner et al., 2019). Manual model building was performed using Coot (Emsley et al., 2010) and the model was subsequently refined using REFMAC (Murshudov et al., 2011). No I/σ(I) cutoff was applied during refinement. Final validation of the model was conducted via MolProbity (Chen et al., 2010). All figures were prepared using PyMOL (version 1.8; Schrödinger). Refinement statistics are summarized in Table 4.

Results and discussion
The spliceosomal DEAH-box helicase Prp16 from the thermophilic ascomycete C. thermophilum (ctPrp16) was crystallized in complex with ADP to mimic a post-catalytic state. The ortholog from C. thermophilum was chosen as proteins from this organism exhibit high thermostability and therefore tend to crystallize better (Amlacher et al., 2011). Several studies of the closely related DEAH-box helicases Prp43, Prp2 and Prp22 have shown that orthologs from C. thermophilum are highly suitable for structural investigations of spliceosomal helicases (Tauchert et al., 2016, 2017; Schmitt et al., 2018; Hamann et al., 2019; Absmeier et al., 2020). An N- and C-terminally truncated construct, ctPrp16 (302-920), was used for crystallization, as the removed parts are expected to be mainly disordered according to the PredictProtein server (Bernhofer et al., 2021). The crystal belonged to the orthorhombic space group P2₁2₁2₁, with unit-cell parameters a = 55.13, b = 102.14, c = 106.61 Å, α = β = γ = 90° (Table 2). The phase problem was solved via molecular replacement using the structure of ADP-bound ctPrp2 as the search model (PDB entry 6fac). The asymmetric unit contains one ctPrp16 molecule. The polder omit map clearly reveals the presence of ADP and a magnesium ion in the active center (Supplementary Figs. S2 and S3). The reported structure of ctPrp16 in complex with ADP was determined at a resolution of 1.9 Å and was refined to an Rwork of 19.3% and an Rfree of 23.8%. According to the Ramachandran plot, 96.93% of all residues are in the most favored region, 3.07% are in the allowed region and 0% are outliers (  (Fig. 1). While the RecA domain architecture is shared by all members of the SF2 helicase family, the three C-terminal domains are characteristic of DEAH-box helicases.
A prominent β-hairpin, which protrudes out of the RecA2 domain and comprises residues 601-620 in ctPrp16, is another typical feature of DEAH-box helicases.
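As a plausibility check on the reported unit cell, the cell volume and the Matthews coefficient can be estimated as in the sketch below. Several inputs are assumptions rather than values from the text: the molar mass (about 71.5 kDa, back-calculated from the concentrations given in the crystallization section), the count of four molecules per cell (four symmetry operators in P2₁2₁2₁ times one molecule per asymmetric unit) and the commonly used approximation that the solvent fraction is 1 − 1.23/V_M. The function names are illustrative.

```python
def orthorhombic_volume(a: float, b: float, c: float) -> float:
    """Unit-cell volume in Å^3 for a cell with all angles equal to 90°."""
    return a * b * c


def matthews_coefficient(volume: float, molecules_per_cell: int,
                         molar_mass: float) -> float:
    """Matthews coefficient V_M in Å^3 per Da."""
    return volume / (molecules_per_cell * molar_mass)


def solvent_fraction(v_m: float) -> float:
    """Approximate solvent fraction, assuming a typical protein density."""
    return 1.0 - 1.23 / v_m


volume = orthorhombic_volume(55.13, 102.14, 106.61)  # reported cell edges in Å
# Assumptions: 4 molecules per cell, molar mass of about 71 454 g/mol
v_m = matthews_coefficient(volume, 4, 71_454.0)

print(f"V = {volume:.0f} Å^3, V_M = {v_m:.2f} Å^3/Da, "
      f"solvent = {solvent_fraction(v_m):.0%}")
```

Under these assumptions the estimate falls within the range usually seen for protein crystals, which is consistent with a single ctPrp16 molecule in the asymmetric unit.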

ADP binding of ctPrp16
ctPrp16, which is a member of the SF2 helicase family, possesses a well conserved catalytic core comprising several highly similar structural motifs that are spread over both RecA-like domains. Motifs Ia (³⁵⁹TQPRRVAA³⁶⁶), Ib (⁴⁰⁵TDGVLLR⁴¹¹) and IV (⁵²⁰LVFMTG⁵²⁵) are reported to be necessary for interaction with the substrate nucleic acid, while motifs I (³³⁴GSGKT³³⁸), II (⁴²⁸DEAH⁴³¹), V (⁵⁸¹TNIAETSLT⁵⁸⁹) and VI (⁶²⁸QRAGRAGR⁶³⁵) are needed for binding and hydrolysis of the nucleoside triphosphate. Motif III (⁴⁶⁰SAT⁴⁶²) couples nucleotide hydrolysis to the unwinding activity. In the structure reported here, an ADP molecule as well as the catalytically required magnesium ion are present at the interface between the RecA1 and RecA2 domains, corresponding to ctPrp16 in the post-catalytic state. The magnesium ion is sixfold coordinated by four water molecules, Thr342 located within motif I and the β-phosphate of the ADP molecule. The β-phosphate is furthermore coordinated by interactions with the backbone amides of Gly334, Ser335, Gly336, Lys337 and Thr338 and with the side chains of Lys341 and Thr342. The former group of five amino acids together form motif I. Additionally, four water molecules could be identified in the structure which are involved in the interaction network of the β-phosphate, making a total of 13 interactions between this phosphate and ctPrp16 (Fig. 2a). The  Thr589, one water molecule and the side chain of Asp595. The O3′ atom interacts with the side chains of Asp595 and Arg635. An additional interaction could be observed between one water molecule and the O4′ atom of the ribose ring. While Arg635 (motif VI) belongs to this conserved region of DEAH-box helicases, Thr593 and Asp595 do not. The adenine moiety does not form any polar contacts with ctPrp16 residues, but the adenine interacts with Phe567 via π-stacking (Fig. 2b).
This interaction seems to be highly conserved, as it can be found in all available X-ray structures of spliceosomal DEAH-box helicases bound to ADP except for one structure (PDB entry 6faa), in which the adenine base is flipped over into a syn conformation (Tauchert et al., 2016; Schmitt et al., 2018; Hamann et al., 2020). The eponymous motif II of this protein class, DEAH, is not directly involved in any interactions between the protein and the nucleotide. Instead, it is involved in the formation of the interaction network of the magnesium ion by interacting with two water molecules via the carbonyl groups of Asp428 and Glu429. Furthermore, Glu131 stabilizes the position of one water molecule near the magnesium ion. Overall, the interactions of spliceosomal DEAH-box helicases from C. thermophilum with ADP seem to be highly conserved: ctPrp2 (PDB entry 6fac) and ctPrp43 (PDB entry 5d0u) exhibit the same interaction pattern (Schmitt et al., 2018; Tauchert et al., 2016).

Position of the β-hairpin
Based on structural comparison with the Ski2-like helicase Hel308, the RecA2 β-hairpin has been suggested to be involved in the double-strand separation of DEAH-box helicases (Büttner et al., 2007; He et al., 2010; Walbott et al., 2010). In contrast, a crystal structure of Prp22 in complex with ssRNA shows the 5′ end of the RNA locked in a conformation with the bases pointing away from the β-hairpin, positioning a potential double strand on the opposite side of this structural element. Comparison with other helicases suggests that the conformation of the β-hairpin observed in Prp16 would support its role as a physical barrier, separating the 3′ portion of the bound RNA that exhibits a stacked conformation from the 5′ portion that mainly interacts with the C-terminal domains. Thereby, it is thought to prevent backsliding of the RNA strand during translocation (Hamann et al., 2019).
Superposition of all spliceosomal DEAH-box helicase structures from C. thermophilum in different conformational stages reveals the unique conformation of the β-hairpin in the ctPrp16 structure (Fig. 3). While all other X-ray structures harbor the β-hairpin between the WH and OB-fold domains, independent of the nucleotide- or substrate-loading state, in ctPrp16 the upper part of the β-hairpin with its loop region is pushed out of the cleft and interacts exclusively with the OB-fold domain. However, this conformation appears to be an artifact caused by interaction with a symmetry mate within the crystal lattice (Supplementary Fig. S1). The bulky amino acids Glu306 and Phe311 of the symmetry mate, located at the N-terminus of the truncated ctPrp16, interfere with the loop which connects the two β-strands forming the β-hairpin. Arg610 in particular, which is in the middle of the loop region, would clash with the previously mentioned N-terminal residues if the β-hairpin adopted a position between the WH and OB-fold domains. In the known complex structures of DEAH-box helicases with bound RNA, there is only one conserved Lys residue of the β-hairpin that interacts with the RNA, for example Lys835 in the ctPrp22-RNA complex (Hamann et al., 2019). Superposition of the ctPrp16 structure with the structure of the ctPrp22-RNA complex shows that the conserved Lys is located virtually in the same position in both structures, meaning that RNA binding seems to be unaffected by the unique conformation of the β-hairpin in ctPrp16.
Figure 3. Comparison of the different β-hairpin conformations. The structure of ctPrp16 was superposed with different spliceosomal DEAH-box helicases from C. thermophilum in different catalytic states via their RecA1 domains. The different β-hairpins are represented in cartoon mode, while the remaining part of the protein is shown as a gray surface (ctPrp16-ADP only). Parts of the OB-fold domain were omitted for clarity. The superposition reveals that the β-hairpin of ctPrp16-ADP adopts a more distant conformation compared with the other structures.

Comparison with Prp16 structures emerging from cryo-EM 3D reconstructions
In recent years, five different structural models of spliceosomal C* or C complexes from Saccharomyces cerevisiae and Homo sapiens containing Prp16 have been obtained by means of single-particle cryo-EM (Wilkinson et al., 2021; Galej et al., 2016; Yan et al., 2017; Bertram et al., 2020; Zhan et al., 2018). In the area of the spliceosome which was predicted to harbor Prp16, the authors reported a local resolution ranging between 8 and 15 Å. Interestingly, in none of the structures was additional cryo-EM map density observed for the 300 N-terminal residues. In fact, the cryo-EM maps indicated the presence of the helicase core of Prp16 or only parts of it. The N-terminal region of Prp16 could not be traced despite its function in the spliceosomal context (Wang & Guthrie, 1998). For a direct structural comparison with ctPrp16 in complex with ADP, the cryo-EM 3D reconstructions for which a subnanometre local resolution of Prp16 was reported were chosen (PDB entries 7b9v and 5yzg). For further comparison, the structure of the closely related spliceosomal DEAH-box helicase Prp22 in its respective apo state was also used (PDB entry 6i3o). The overall alignment shows that the RecA1 and C-terminal domains are quite similar in their arrangement, while the location of the RecA2 domain differs dramatically. In contrast to the other structures, the RecA2 domain of ADP-bound ctPrp16 is clearly shifted towards the RecA1 domain (Fig. 4a). To compare the exact distances between the RecA-like domains of each analyzed helicase, the center of mass of each RecA domain was determined and the distance between them was measured (Fig. 4b). While the structure of ADP-bound ctPrp16 exhibits a center-of-mass distance of 28.1 Å between the two RecA domains, the distances calculated for Prp16 molecules derived from the cryo-EM structures of spliceosomes are increased by about 3 Å (31.1 Å in PDB entry 7b9v and 30.7 Å in PDB entry 5yzg). Therefore, it can be concluded that Prp16 in the spliceosome structures is unlikely to exhibit the ADP-bound post-catalytic state. In fact, the increased RecA distance agrees better with the center-of-mass distance measured in the structure of Prp22 in its apo state (PDB entry 6i3o, 31.0 Å). This is in line with the suggestion of Wilkinson and coworkers, who stated that they used the structure of scPrp43 in complex with ADP as the starting model but that the corresponding map was more similar to the conformation of Prp22 bound to RNA found in PDB entry 6i3p (Wilkinson et al., 2021; PDB entry 7b9v). Zhan and coworkers also used the structure of scPrp43 bound to ADP as the starting model but recognized that the RecA2 domain is shifted by a distance of 5-8 Å (Zhan et al., 2018; PDB entry 5yzg). Therefore, it can be concluded that the model of Prp16 presented in this work is the first model of Prp16 with subnanometre resolution bound to a nucleotide.
Figure 4. Movement of RecA2 upon nucleotide binding. (a) Comparison of Prp16 models originating from X-ray diffraction and cryo-EM and Prp22 (in a nucleotide-free state). The different structures are shown as ribbons and are aligned via their RecA1-like domains. The RecA2 domain of Prp16-ADP is shifted towards the RecA1 domain compared with the other structures. (b) All structures are depicted as semi-transparent cartoon models. The RecA1 domain is colored orange, the RecA2 domain marine and the C-terminal part gray. The centers of mass of the RecA-like domains are displayed as spheres and colored accordingly. In order to calculate the centers of mass for the same sets of atoms, the RecA1 and RecA2 domains were superimposed and the centers of mass of these superimposed domains were determined. Prp16 structures derived from a cryo-EM model adopt a more open conformation that is comparable to the distances of Prp22 in a nucleotide-free state (PDB entry 6i3o) and the mammalian ortholog DHX38 of yeast Prp16 (bottom right).
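The center-of-mass comparison described above can be illustrated with a minimal sketch. The software used for this measurement is not stated in the text, so the mass-weighted formulation, the small table of atomic masses and the toy coordinates below are all assumptions for illustration; they are not taken from the deposited structures.

```python
import math

# Assumption: standard atomic masses for the common protein heavy atoms.
ATOMIC_MASS = {"C": 12.011, "N": 14.007, "O": 15.999, "S": 32.06}


def center_of_mass(atoms):
    """Mass-weighted mean position of (element, x, y, z) tuples, in Å."""
    total = sx = sy = sz = 0.0
    for element, x, y, z in atoms:
        mass = ATOMIC_MASS[element]
        total += mass
        sx += mass * x
        sy += mass * y
        sz += mass * z
    return (sx / total, sy / total, sz / total)


def com_distance(domain_a, domain_b):
    """Euclidean distance between the centers of mass of two atom sets."""
    return math.dist(center_of_mass(domain_a), center_of_mass(domain_b))


# Toy coordinates standing in for the RecA1 and RecA2 atom selections.
toy_reca1 = [("C", 0.0, 0.0, 0.0), ("C", 2.0, 0.0, 0.0)]
toy_reca2 = [("N", 0.0, 3.0, 4.0), ("N", 2.0, 3.0, 4.0)]

print(f"{com_distance(toy_reca1, toy_reca2):.1f} Å")
```

For a real comparison, the same residue ranges would have to be selected in every structure, which corresponds to the requirement stated in the caption of Fig. 4(b) that the centers of mass be calculated for the same sets of atoms.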